Overview

Dataset statistics

Number of variables: 20
Number of observations: 338592
Missing cells: 2716785
Missing cells (%): 40.1%
Duplicate rows: 309474
Duplicate rows (%): 91.4%
Total size in memory: 51.7 MiB
Average record size in memory: 160.0 B
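
The percentages in this table follow directly from the raw counts. The short check below is only a sketch: it assumes that the missing-cell percentage is taken over all cells (variables times observations) and that the reported 51.7 MiB is a rounded figure, so the record size is reproduced only approximately.

```python
# Counts copied from the "Dataset statistics" table.
n_variables = 20
n_observations = 338592
missing_cells = 2716785
duplicate_rows = 309474
total_bytes = 51.7 * 1024 ** 2  # 51.7 MiB, already rounded in the report

# Assumption: percentages are relative to all cells and all rows respectively.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations
avg_record_bytes = total_bytes / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Average record size: about {avg_record_bytes:.0f} B")
```

An average of roughly 160 B per record is consistent with 20 columns stored at 8 bytes each.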

Variable types

Numeric: 18
Categorical: 2

Warnings

Dataset has 309474 (91.4%) duplicate rows [Duplicates]
PayFukusyoPay1 is highly correlated with PayFukusyoUmaban5 and 1 other field [High correlation]
PayFukusyoNinki1 is highly correlated with PayFukusyoUmaban5 and 1 other field [High correlation]
PayFukusyoUmaban2 is highly correlated with PayFukusyoUmaban5 and 1 other field [High correlation]
PayFukusyoPay2 is highly correlated with PayFukusyoUmaban5 and 1 other field [High correlation]
PayFukusyoNinki2 is highly correlated with PayFukusyoUmaban5 and 1 other field [High correlation]
PayFukusyoUmaban3 is highly correlated with PayFukusyoUmaban5 and 1 other field [High correlation]
PayFukusyoPay3 is highly correlated with PayFukusyoUmaban5 and 1 other field [High correlation]
PayFukusyoNinki3 is highly correlated with PayFukusyoUmaban5 and 1 other field [High correlation]
PayFukusyoUmaban4 is highly correlated with PayFukusyoUmaban5 and 1 other field [High correlation]
PayFukusyoPay4 is highly correlated with PayFukusyoUmaban5 and 1 other field [High correlation]
PayFukusyoNinki4 is highly correlated with PayFukusyoUmaban5 and 1 other field [High correlation]
PayFukusyoUmaban5 is highly correlated with PayFukusyoPay1 and 15 other fields [High correlation]
PayFukusyoNinki5 is highly correlated with PayFukusyoPay1 and 15 other fields [High correlation]
PayWakurenKumi1 is highly correlated with PayFukusyoUmaban5 and 2 other fields [High correlation]
PayWakurenPay1 is highly correlated with PayFukusyoUmaban5 and 1 other field [High correlation]
PayWakurenNinki1 is highly correlated with PayFukusyoUmaban5 and 1 other field [High correlation]
PayWakurenPay2 is highly correlated with PayWakurenNinki2 [High correlation]
PayWakurenNinki2 is highly correlated with PayWakurenPay2 [High correlation]
PayUmarenKumi1 is highly correlated with PayFukusyoUmaban5 and 2 other fields [High correlation]
PayFukusyoUmaban5 is highly correlated with PayFukusyoNinki5 [High correlation]
PayFukusyoNinki5 is highly correlated with PayFukusyoUmaban5 [High correlation]
PayFukusyoUmaban4 has 337672 (99.7%) missing values [Missing]
PayFukusyoPay4 has 337672 (99.7%) missing values [Missing]
PayFukusyoNinki4 has 337672 (99.7%) missing values [Missing]
PayFukusyoUmaban5 has 338561 (> 99.9%) missing values [Missing]
PayFukusyoNinki5 has 338561 (> 99.9%) missing values [Missing]
PayWakurenPay1 has 4958 (1.5%) missing values [Missing]
PayWakurenNinki1 has 4958 (1.5%) missing values [Missing]
PayWakurenKumi2 has 337965 (99.8%) missing values [Missing]
PayWakurenPay2 has 337965 (99.8%) missing values [Missing]
PayWakurenNinki2 has 337965 (99.8%) missing values [Missing]
PayWakurenKumi1 has 4958 (1.5%) zeros [Zeros]

Reproduction

Analysis started: 2021-04-07 12:55:50.540495
Analysis finished: 2021-04-07 12:57:26.639600
Duration: 1 minute and 36.1 seconds
Software version: pandas-profiling v2.11.0
Download configuration: config.yaml

Variables

PayFukusyoPay1
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 351
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 291.9809978
Minimum: 100
Maximum: 11030
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.6 MiB

Quantile statistics

Minimum: 100
5-th percentile: 110
Q1: 140
Median: 180
Q3: 290
95-th percentile: 800
Maximum: 11030
Range: 10930
Interquartile range (IQR): 150

Descriptive statistics

Standard deviation: 400.493873
Coefficient of variation (CV): 1.371643621
Kurtosis: 122.7952143
Mean: 291.9809978
Median Absolute Deviation (MAD): 60
Skewness: 8.540567827
Sum: 98862430
Variance: 160395.3423
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
110 | 40548 | 12.0%
140 | 22087 | 6.5%
150 | 21196 | 6.3%
130 | 20677 | 6.1%
160 | 18100 | 5.3%
120 | 17757 | 5.2%
170 | 15558 | 4.6%
180 | 14495 | 4.3%
190 | 13723 | 4.1%
200 | 11530 | 3.4%
Other values (341) | 142921 | 42.2%

Value | Count | Frequency (%)
100 | 2606 | 0.8%
110 | 40548 | 12.0%
120 | 17757 | 5.2%
130 | 20677 | 6.1%
140 | 22087 | 6.5%

Value | Count | Frequency (%)
11030 | 12 | < 0.1%
10030 | 14 | < 0.1%
10000 | 16 | < 0.1%
8340 | 14 | < 0.1%
8030 | 12 | < 0.1%
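
Several of the PayFukusyoPay1 figures are derived from others in the same block. The sketch below recomputes them from the reported values as a consistency check. It uses only the standard definitions (range = maximum - minimum, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared) and does not reproduce how the report itself computes them.

```python
# Values copied from the PayFukusyoPay1 statistics.
mean = 291.9809978
std = 400.493873
minimum, maximum = 100, 11030
q1, q3 = 140, 290

value_range = maximum - minimum  # reported Range: 10930
iqr = q3 - q1                    # reported IQR: 150
cv = std / mean                  # reported CV: 1.371643621
variance = std ** 2              # reported Variance: 160395.3423

print(value_range, iqr, round(cv, 6), round(variance, 2))
```

A CV above 1 together with the large skewness and kurtosis indicates a heavily right-skewed payout distribution, which matches the gap between the median (180) and the maximum (11030).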

PayFukusyoNinki1
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 18
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.505277738
Minimum: 1
Maximum: 18
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.6 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 3
Q3: 5
95-th percentile: 9
Maximum: 18
Range: 17
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.872211823
Coefficient of variation (CV): 0.8193963608
Kurtosis: 1.987844545
Mean: 3.505277738
Median Absolute Deviation (MAD): 2
Skewness: 1.464734818
Sum: 1186859
Variance: 8.249600753
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=18)
Value | Count | Frequency (%)
1 | 105030 | 31.0%
2 | 63158 | 18.7%
3 | 43399 | 12.8%
4 | 34055 | 10.1%
5 | 23781 | 7.0%
6 | 18878 | 5.6%
7 | 14706 | 4.3%
8 | 10848 | 3.2%
9 | 7946 | 2.3%
10 | 5528 | 1.6%
Other values (8) | 11263 | 3.3%

Value | Count | Frequency (%)
1 | 105030 | 31.0%
2 | 63158 | 18.7%
3 | 43399 | 12.8%
4 | 34055 | 10.1%
5 | 23781 | 7.0%

Value | Count | Frequency (%)
18 | 49 | < 0.1%
17 | 31 | < 0.1%
16 | 394 | 0.1%
15 | 825 | 0.2%
14 | 1188 | 0.4%

PayFukusyoUmaban2
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 18
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.935444429
Minimum: 1
Maximum: 18
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.6 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 4
Median: 8
Q3: 11
95-th percentile: 15
Maximum: 18
Range: 17
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 4.455406002
Coefficient of variation (CV): 0.5614563926
Kurtosis: -1.016068041
Mean: 7.935444429
Median Absolute Deviation (MAD): 4
Skewness: 0.1835155081
Sum: 2686878
Variance: 19.85064265
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=18)
Value | Count | Frequency (%)
4 | 24362 | 7.2%
6 | 23758 | 7.0%
5 | 23652 | 7.0%
2 | 23586 | 7.0%
8 | 23538 | 7.0%
3 | 23437 | 6.9%
9 | 23039 | 6.8%
7 | 22788 | 6.7%
10 | 22715 | 6.7%
1 | 22263 | 6.6%
Other values (8) | 105454 | 31.1%

Value | Count | Frequency (%)
1 | 22263 | 6.6%
2 | 23586 | 7.0%
3 | 23437 | 6.9%
4 | 24362 | 7.2%
5 | 23652 | 7.0%

Value | Count | Frequency (%)
18 | 2166 | 0.6%
17 | 2040 | 0.6%
16 | 12046 | 3.6%
15 | 13419 | 4.0%
14 | 16199 | 4.8%

PayFukusyoPay2
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 413
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 362.8882844
Minimum: 100
Maximum: 14080
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.6 MiB

Quantile statistics

Minimum: 100
5-th percentile: 110
Q1: 150
Median: 210
Q3: 370
95-th percentile: 1100
Maximum: 14080
Range: 13980
Interquartile range (IQR): 220

Descriptive statistics

Standard deviation: 495.5267378
Coefficient of variation (CV): 1.365507676
Kurtosis: 83.60845008
Mean: 362.8882844
Median Absolute Deviation (MAD): 80
Skewness: 6.868840582
Sum: 122871070
Variance: 245546.7479
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
110 | 21375 | 6.3%
140 | 19404 | 5.7%
150 | 18534 | 5.5%
130 | 17917 | 5.3%
160 | 17167 | 5.1%
170 | 16298 | 4.8%
180 | 14299 | 4.2%
120 | 13016 | 3.8%
190 | 12865 | 3.8%
200 | 11164 | 3.3%
Other values (403) | 176553 | 52.1%

Value | Count | Frequency (%)
100 | 951 | 0.3%
110 | 21375 | 6.3%
120 | 13016 | 3.8%
130 | 17917 | 5.3%
140 | 19404 | 5.7%

Value | Count | Frequency (%)
14080 | 9 | < 0.1%
11820 | 14 | < 0.1%
10450 | 12 | < 0.1%
10290 | 14 | < 0.1%
8940 | 14 | < 0.1%

PayFukusyoNinki2
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 18
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.33293167
Minimum: 1
Maximum: 18
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.6 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 3
Q3: 6
95-th percentile: 11
Maximum: 18
Range: 17
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 3.182678145
Coefficient of variation (CV): 0.7345322723
Kurtosis: 0.872067072
Mean: 4.33293167
Median Absolute Deviation (MAD): 2
Skewness: 1.14855457
Sum: 1467096
Variance: 10.12944018
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=18)
Value | Count | Frequency (%)
1 | 63584 | 18.8%
2 | 60491 | 17.9%
3 | 48375 | 14.3%
4 | 37968 | 11.2%
5 | 30472 | 9.0%
6 | 23683 | 7.0%
7 | 18685 | 5.5%
8 | 15062 | 4.4%
9 | 11753 | 3.5%
10 | 8655 | 2.6%
Other values (8) | 19864 | 5.9%

Value | Count | Frequency (%)
1 | 63584 | 18.8%
2 | 60491 | 17.9%
3 | 48375 | 14.3%
4 | 37968 | 11.2%
5 | 30472 | 9.0%

Value | Count | Frequency (%)
18 | 51 | < 0.1%
17 | 187 | 0.1%
16 | 612 | 0.2%
15 | 1323 | 0.4%
14 | 2462 | 0.7%

PayFukusyoUmaban3
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 19
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.792360717
Minimum: 0
Maximum: 18
Zeros: 1418
Zeros (%): 0.4%
Memory size: 2.6 MiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 4
Median: 7
Q3: 11
95-th percentile: 15
Maximum: 18
Range: 18
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 4.507472884
Coefficient of variation (CV): 0.5784476679
Kurtosis: -1.014423756
Mean: 7.792360717
Median Absolute Deviation (MAD): 4
Skewness: 0.214404183
Sum: 2638431
Variance: 20.3173118
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=19)
Value | Count | Frequency (%)
3 | 24629 | 7.3%
2 | 24416 | 7.2%
6 | 24124 | 7.1%
1 | 23923 | 7.1%
4 | 23907 | 7.1%
5 | 23675 | 7.0%
8 | 23258 | 6.9%
7 | 23227 | 6.9%
10 | 22059 | 6.5%
9 | 21645 | 6.4%
Other values (9) | 103729 | 30.6%

Value | Count | Frequency (%)
0 | 1418 | 0.4%
1 | 23923 | 7.1%
2 | 24416 | 7.2%
3 | 24629 | 7.3%
4 | 23907 | 7.1%

Value | Count | Frequency (%)
18 | 1994 | 0.6%
17 | 2458 | 0.7%
16 | 11477 | 3.4%
15 | 13978 | 4.1%
14 | 16156 | 4.8%

PayFukusyoPay3
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 483
Distinct (%): 0.1%
Missing: 1418
Missing (%): 0.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 443.4170191
Minimum: 100
Maximum: 16110
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.6 MiB

Quantile statistics

Minimum: 100
5-th percentile: 120
Q1: 170
Median: 250
Q3: 460
95-th percentile: 1410
Maximum: 16110
Range: 16010
Interquartile range (IQR): 290

Descriptive statistics

Standard deviation: 613.856985
Coefficient of variation (CV): 1.384378494
Kurtosis: 70.07135294
Mean: 443.4170191
Median Absolute Deviation (MAD): 100
Skewness: 6.191340491
Sum: 149508690
Variance: 376820.398
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
140 | 15326 | 4.5%
150 | 14732 | 4.4%
160 | 14292 | 4.2%
170 | 13900 | 4.1%
130 | 13100 | 3.9%
180 | 12999 | 3.8%
110 | 12449 | 3.7%
190 | 12256 | 3.6%
200 | 10739 | 3.2%
210 | 10102 | 3.0%
Other values (473) | 207279 | 61.2%

Value | Count | Frequency (%)
100 | 489 | 0.1%
110 | 12449 | 3.7%
120 | 10071 | 3.0%
130 | 13100 | 3.9%
140 | 15326 | 4.5%

Value | Count | Frequency (%)
16110 | 13 | < 0.1%
14390 | 16 | < 0.1%
11720 | 8 | < 0.1%
11580 | 12 | < 0.1%
10730 | 15 | < 0.1%

PayFukusyoNinki3
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 18
Distinct (%): < 0.1%
Missing: 1418
Missing (%): 0.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.11755948
Minimum: 1
Maximum: 18
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.6 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 4
Q3: 7
95-th percentile: 12
Maximum: 18
Range: 17
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 3.397833506
Coefficient of variation (CV): 0.6639558407
Kurtosis: 0.2589021328
Mean: 5.11755948
Median Absolute Deviation (MAD): 2
Skewness: 0.8851947252
Sum: 1725508
Variance: 11.54527254
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=18)
Value | Count | Frequency (%)
2 | 45692 | 13.5%
3 | 44057 | 13.0%
1 | 43399 | 12.8%
4 | 39715 | 11.7%
5 | 34382 | 10.2%
6 | 29039 | 8.6%
7 | 24439 | 7.2%
8 | 19710 | 5.8%
9 | 16087 | 4.8%
10 | 12375 | 3.7%
Other values (8) | 28279 | 8.4%

Value | Count | Frequency (%)
1 | 43399 | 12.8%
2 | 45692 | 13.5%
3 | 44057 | 13.0%
4 | 39715 | 11.7%
5 | 34382 | 10.2%

Value | Count | Frequency (%)
18 | 157 | < 0.1%
17 | 239 | 0.1%
16 | 1320 | 0.4%
15 | 2328 | 0.7%
14 | 3561 | 1.1%

PayFukusyoUmaban4
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 15
Distinct (%): 1.6%
Missing: 337672
Missing (%): 99.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.06521739
Minimum: 3
Maximum: 18
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.6 MiB

Quantile statistics

Minimum: 3
5-th percentile: 5
Q1: 8
Median: 11
Q3: 14
95-th percentile: 16
Maximum: 18
Range: 15
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.669728379
Coefficient of variation (CV): 0.3316453937
Kurtosis: -0.9756151074
Mean: 11.06521739
Median Absolute Deviation (MAD): 3
Skewness: -0.181465139
Sum: 10180
Variance: 13.46690637
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=15)
Value | Count | Frequency (%)
15 | 116 | < 0.1%
7 | 113 | < 0.1%
14 | 113 | < 0.1%
12 | 84 | < 0.1%
8 | 73 | < 0.1%
11 | 66 | < 0.1%
13 | 64 | < 0.1%
10 | 58 | < 0.1%
9 | 58 | < 0.1%
16 | 53 | < 0.1%
Other values (5) | 122 | < 0.1%
(Missing) | 337672 | 99.7%

Value | Count | Frequency (%)
3 | 8 | < 0.1%
4 | 25 | < 0.1%
5 | 39 | < 0.1%
6 | 21 | < 0.1%
7 | 113 | < 0.1%

Value | Count | Frequency (%)
18 | 29 | < 0.1%
16 | 53 | < 0.1%
15 | 116 | < 0.1%
14 | 113 | < 0.1%
13 | 64 | < 0.1%

PayFukusyoPay4
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 39
Distinct (%): 4.2%
Missing: 337672
Missing (%): 99.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 411.1630435
Minimum: 110
Maximum: 2220
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.6 MiB

Quantile statistics

Minimum: 110
5-th percentile: 110
Q1: 140
Median: 260
Q3: 470
95-th percentile: 1600
Maximum: 2220
Range: 2110
Interquartile range (IQR): 330

Descriptive statistics

Standard deviation: 431.5915022
Coefficient of variation (CV): 1.049684569
Kurtosis: 5.758382805
Mean: 411.1630435
Median Absolute Deviation (MAD): 130
Skewness: 2.422839035
Sum: 378270
Variance: 186271.2247
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=39)
Value | Count | Frequency (%)
180 | 92 | < 0.1%
110 | 82 | < 0.1%
130 | 56 | < 0.1%
140 | 54 | < 0.1%
410 | 45 | < 0.1%
120 | 44 | < 0.1%
540 | 41 | < 0.1%
290 | 37 | < 0.1%
150 | 34 | < 0.1%
240 | 27 | < 0.1%
Other values (29) | 408 | 0.1%
(Missing) | 337672 | 99.7%

Value | Count | Frequency (%)
110 | 82 | < 0.1%
120 | 44 | < 0.1%
130 | 56 | < 0.1%
140 | 54 | < 0.1%
150 | 34 | < 0.1%

Value | Count | Frequency (%)
2220 | 15 | < 0.1%
1750 | 16 | < 0.1%
1600 | 18 | < 0.1%
1460 | 15 | < 0.1%
1380 | 8 | < 0.1%

PayFukusyoNinki4
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 14
Distinct (%): 1.5%
Missing: 337672
Missing (%): 99.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.15
Minimum: 1
Maximum: 15
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.6 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
Median: 6
Q3: 9
95-th percentile: 13
Maximum: 15
Range: 14
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.729904444
Coefficient of variation (CV): 0.6064885274
Kurtosis: -1.030285916
Mean: 6.15
Median Absolute Deviation (MAD): 3
Skewness: 0.34859224
Sum: 5658
Variance: 13.91218716
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=14)
Value | Count | Frequency (%)
3 | 156 | < 0.1%
10 | 98 | < 0.1%
2 | 86 | < 0.1%
9 | 85 | < 0.1%
4 | 83 | < 0.1%
1 | 82 | < 0.1%
6 | 71 | < 0.1%
8 | 60 | < 0.1%
11 | 52 | < 0.1%
5 | 43 | < 0.1%
Other values (4) | 104 | < 0.1%
(Missing) | 337672 | 99.7%

Value | Count | Frequency (%)
1 | 82 | < 0.1%
2 | 86 | < 0.1%
3 | 156 | < 0.1%
4 | 83 | < 0.1%
5 | 43 | < 0.1%

Value | Count | Frequency (%)
15 | 15 | < 0.1%
13 | 33 | < 0.1%
12 | 24 | < 0.1%
11 | 52 | < 0.1%
10 | 98 | < 0.1%

PayFukusyoUmaban5
Categorical

HIGH CORRELATION
MISSING

Distinct: 2
Distinct (%): 6.5%
Missing: 338561
Missing (%): > 99.9%
Memory size: 2.6 MiB
11.0: 18
14.0: 13

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 124
Distinct characters: 4
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 14.0
2nd row: 14.0
3rd row: 14.0
4th row: 14.0
5th row: 14.0

Value | Count | Frequency (%)
11.0 | 18 | < 0.1%
14.0 | 13 | < 0.1%
(Missing) | 338561 | > 99.9%

Histogram of lengths of the category
Value | Count | Frequency (%)
11.0 | 18 | 58.1%
14.0 | 13 | 41.9%

Most occurring characters

Value | Count | Frequency (%)
1 | 49 | 39.5%
. | 31 | 25.0%
0 | 31 | 25.0%
4 | 13 | 10.5%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 93 | 75.0%
Other Punctuation | 31 | 25.0%

Most frequent character per category

Value | Count | Frequency (%)
1 | 49 | 52.7%
0 | 31 | 33.3%
4 | 13 | 14.0%

Value | Count | Frequency (%)
. | 31 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 124 | 100.0%

Most frequent character per script

Value | Count | Frequency (%)
1 | 49 | 39.5%
. | 31 | 25.0%
0 | 31 | 25.0%
4 | 13 | 10.5%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 124 | 100.0%

Most frequent character per block

Value | Count | Frequency (%)
1 | 49 | 39.5%
. | 31 | 25.0%
0 | 31 | 25.0%
4 | 13 | 10.5%

PayFukusyoNinki5
Categorical

HIGH CORRELATION
MISSING

Distinct: 2
Distinct (%): 6.5%
Missing: 338561
Missing (%): > 99.9%
Memory size: 2.6 MiB
2.0: 18
1.0: 13

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 93
Distinct characters: 4
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1.0
2nd row: 1.0
3rd row: 1.0
4th row: 1.0
5th row: 1.0

Value | Count | Frequency (%)
2.0 | 18 | < 0.1%
1.0 | 13 | < 0.1%
(Missing) | 338561 | > 99.9%

Histogram of lengths of the category
Value | Count | Frequency (%)
2.0 | 18 | 58.1%
1.0 | 13 | 41.9%

Most occurring characters

Value | Count | Frequency (%)
. | 31 | 33.3%
0 | 31 | 33.3%
2 | 18 | 19.4%
1 | 13 | 14.0%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 62 | 66.7%
Other Punctuation | 31 | 33.3%

Most frequent character per category

Value | Count | Frequency (%)
0 | 31 | 50.0%
2 | 18 | 29.0%
1 | 13 | 21.0%

Value | Count | Frequency (%)
. | 31 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 93 | 100.0%

Most frequent character per script

Value | Count | Frequency (%)
. | 31 | 33.3%
0 | 31 | 33.3%
2 | 18 | 19.4%
1 | 13 | 14.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 93 | 100.0%

Most frequent character per block

Value | Count | Frequency (%)
. | 31 | 33.3%
0 | 31 | 33.3%
2 | 18 | 19.4%
1 | 13 | 14.0%

PayWakurenKumi1
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 37
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 39.5166454
Minimum: 0
Maximum: 88
Zeros: 4958
Zeros (%): 1.5%
Memory size: 2.6 MiB

Quantile statistics

Minimum: 0
5-th percentile: 13
Q1: 23
Median: 37
Q3: 56
95-th percentile: 78
Maximum: 88
Range: 88
Interquartile range (IQR): 33

Descriptive statistics

Standard deviation: 20.48558546
Coefficient of variation (CV): 0.51840396
Kurtosis: -0.7857305299
Mean: 39.5166454
Median Absolute Deviation (MAD): 18
Skewness: 0.3596836582
Sum: 13380020
Variance: 419.6592118
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=37)
Value | Count | Frequency (%)
78 | 15974 | 4.7%
67 | 14273 | 4.2%
68 | 13834 | 4.1%
57 | 13686 | 4.0%
58 | 13497 | 4.0%
56 | 12890 | 3.8%
48 | 12186 | 3.6%
47 | 11662 | 3.4%
45 | 11496 | 3.4%
37 | 11220 | 3.3%
Other values (27) | 207874 | 61.4%

Value | Count | Frequency (%)
0 | 4958 | 1.5%
11 | 1767 | 0.5%
12 | 9681 | 2.9%
13 | 9330 | 2.8%
14 | 9505 | 2.8%

Value | Count | Frequency (%)
88 | 4636 | 1.4%
78 | 15974 | 4.7%
77 | 3933 | 1.2%
68 | 13834 | 4.1%
67 | 14273 | 4.2%

PayWakurenPay1
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 1529
Distinct (%): 0.5%
Missing: 4958
Missing (%): 1.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 2145.96426
Minimum: 120
Maximum: 98640
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.6 MiB

Quantile statistics

Minimum: 120
5-th percentile: 300
Q1: 600
Median: 1110
Q3: 2290
95-th percentile: 7450
Maximum: 98640
Range: 98520
Interquartile range (IQR): 1690

Descriptive statistics

Standard deviation: 3460.906262
Coefficient of variation (CV): 1.612751119
Kurtosis: 116.0057631
Mean: 2145.96426
Median Absolute Deviation (MAD): 630
Skewness: 7.71790757
Sum: 715966640
Variance: 11977872.16
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
450 | 2590 | 0.8%
480 | 2559 | 0.8%
500 | 2555 | 0.8%
440 | 2539 | 0.7%
560 | 2528 | 0.7%
580 | 2500 | 0.7%
380 | 2492 | 0.7%
550 | 2451 | 0.7%
610 | 2445 | 0.7%
570 | 2441 | 0.7%
Other values (1519) | 308534 | 91.1%
(Missing) | 4958 | 1.5%

Value | Count | Frequency (%)
120 | 24 | < 0.1%
130 | 40 | < 0.1%
140 | 55 | < 0.1%
150 | 156 | < 0.1%
160 | 245 | 0.1%

Value | Count | Frequency (%)
98640 | 14 | < 0.1%
91210 | 15 | < 0.1%
86110 | 13 | < 0.1%
71830 | 14 | < 0.1%
70970 | 14 | < 0.1%

PayWakurenNinki1
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 36
Distinct (%): < 0.1%
Missing: 4958
Missing (%): 1.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.922196179
Minimum: 1
Maximum: 36
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.6 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 5
Q3: 10
95-th percentile: 21
Maximum: 36
Range: 35
Interquartile range (IQR): 8

Descriptive statistics

Standard deviation: 6.626961752
Coefficient of variation (CV): 0.9573496013
Kurtosis: 1.877255488
Mean: 6.922196179
Median Absolute Deviation (MAD): 3
Skewness: 1.495687857
Sum: 2309480
Variance: 43.91662206
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=36)
Value | Count | Frequency (%)
1 | 64325 | 19.0%
2 | 42681 | 12.6%
3 | 33261 | 9.8%
4 | 25648 | 7.6%
5 | 21377 | 6.3%
6 | 18703 | 5.5%
7 | 16274 | 4.8%
8 | 13724 | 4.1%
9 | 11524 | 3.4%
10 | 9961 | 2.9%
Other values (26) | 76156 | 22.5%

Value | Count | Frequency (%)
1 | 64325 | 19.0%
2 | 42681 | 12.6%
3 | 33261 | 9.8%
4 | 25648 | 7.6%
5 | 21377 | 6.3%

Value | Count | Frequency (%)
36 | 31 | < 0.1%
35 | 89 | < 0.1%
34 | 157 | < 0.1%
33 | 370 | 0.1%
32 | 556 | 0.2%

PayWakurenKumi2
Real number (ℝ≥0)

MISSING

Distinct: 23
Distinct (%): 3.7%
Missing: 337965
Missing (%): 99.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 48.33652313
Minimum: 16
Maximum: 78
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.6 MiB

Quantile statistics

Minimum: 16
5-th percentile: 18
Q1: 35
Median: 47
Q3: 58
95-th percentile: 78
Maximum: 78
Range: 62
Interquartile range (IQR): 23

Descriptive statistics

Standard deviation: 17.80189142
Coefficient of variation (CV): 0.3682906892
Kurtosis: -0.8061137841
Mean: 48.33652313
Median Absolute Deviation (MAD): 11
Skewness: 0.08104622778
Sum: 30307
Variance: 316.9073381
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=23)
Value | Count | Frequency (%)
78 | 86 | < 0.1%
46 | 76 | < 0.1%
47 | 49 | < 0.1%
55 | 37 | < 0.1%
58 | 36 | < 0.1%
37 | 32 | < 0.1%
67 | 29 | < 0.1%
34 | 29 | < 0.1%
57 | 26 | < 0.1%
56 | 26 | < 0.1%
Other values (13) | 201 | 0.1%
(Missing) | 337965 | 99.8%

Value | Count | Frequency (%)
16 | 17 | < 0.1%
17 | 14 | < 0.1%
18 | 12 | < 0.1%
24 | 14 | < 0.1%
26 | 26 | < 0.1%

Value | Count | Frequency (%)
78 | 86 | < 0.1%
68 | 21 | < 0.1%
67 | 29 | < 0.1%
58 | 36 | < 0.1%
57 | 26 | < 0.1%

PayWakurenPay2
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 42
Distinct (%): 6.7%
Missing: 337965
Missing (%): 99.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 994.9760766
Minimum: 160
Maximum: 4810
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.6 MiB

Quantile statistics

Minimum: 160
5-th percentile: 200
Q1: 320
Median: 560
Q3: 1080
95-th percentile: 3190
Maximum: 4810
Range: 4650
Interquartile range (IQR): 760

Descriptive statistics

Standard deviation: 1032.469569
Coefficient of variation (CV): 1.037682808
Kurtosis: 3.460746383
Mean: 994.9760766
Median Absolute Deviation (MAD): 300
Skewness: 1.961916789
Sum: 623850
Variance: 1065993.41
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=42)
Value | Count | Frequency (%)
320 | 57 | < 0.1%
240 | 26 | < 0.1%
370 | 26 | < 0.1%
260 | 24 | < 0.1%
430 | 24 | < 0.1%
560 | 23 | < 0.1%
1000 | 17 | < 0.1%
360 | 16 | < 0.1%
540 | 16 | < 0.1%
2880 | 15 | < 0.1%
Other values (32) | 383 | 0.1%
(Missing) | 337965 | 99.8%

Value | Count | Frequency (%)
160 | 14 | < 0.1%
190 | 12 | < 0.1%
200 | 13 | < 0.1%
240 | 26 | < 0.1%
260 | 24 | < 0.1%

Value | Count | Frequency (%)
4810 | 15 | < 0.1%
3540 | 11 | < 0.1%
3190 | 10 | < 0.1%
3170 | 11 | < 0.1%
3000 | 10 | < 0.1%

PayWakurenNinki2
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 20
Distinct (%): 3.2%
Missing: 337965
Missing (%): 99.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.244019139
Minimum: 1
Maximum: 28
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.6 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 5
Q3: 9
95-th percentile: 24
Maximum: 28
Range: 27
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 6.964958512
Coefficient of variation (CV): 0.9614770998
Kurtosis: 1.626971535
Mean: 7.244019139
Median Absolute Deviation (MAD): 3
Skewness: 1.529543177
Sum: 4542
Variance: 48.51064708
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=20)
Value | Count | Frequency (%)
1 | 95 | < 0.1%
2 | 86 | < 0.1%
3 | 81 | < 0.1%
9 | 77 | < 0.1%
4 | 50 | < 0.1%
6 | 39 | < 0.1%
8 | 25 | < 0.1%
7 | 23 | < 0.1%
5 | 20 | < 0.1%
26 | 15 | < 0.1%
Other values (10) | 116 | < 0.1%
(Missing) | 337965 | 99.8%

Value | Count | Frequency (%)
1 | 95 | < 0.1%
2 | 86 | < 0.1%
3 | 81 | < 0.1%
4 | 50 | < 0.1%
5 | 20 | < 0.1%

Value | Count | Frequency (%)
28 | 15 | < 0.1%
26 | 15 | < 0.1%
24 | 10 | < 0.1%
23 | 10 | < 0.1%
20 | 11 | < 0.1%

PayUmarenKumi1
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 153
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 549.9749994
Minimum: 102
Maximum: 1718
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.6 MiB

Quantile statistics

Minimum: 102
5-th percentile: 106
Q1: 214
Median: 507
Q3: 811
95-th percentile: 1215
Maximum: 1718
Range: 1616
Interquartile range (IQR): 597

Descriptive statistics

Standard deviation: 358.6424735
Coefficient of variation (CV): 0.6521068666
Kurtosis: -0.3568882159
Mean: 549.9749994
Median Absolute Deviation (MAD): 296
Skewness: 0.6847124509
Sum: 186217135
Variance: 128624.4238
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
203 | 4381 | 1.3%
204 | 4275 | 1.3%
405 | 4220 | 1.2%
102 | 4151 | 1.2%
406 | 4082 | 1.2%
206 | 4064 | 1.2%
304 | 4045 | 1.2%
105 | 4032 | 1.2%
104 | 3990 | 1.2%
407 | 3943 | 1.2%
Other values (143) | 297409 | 87.8%

Value | Count | Frequency (%)
102 | 4151 | 1.2%
103 | 3864 | 1.1%
104 | 3990 | 1.2%
105 | 4032 | 1.2%
106 | 3424 | 1.0%

Value | Count | Frequency (%)
1718 | 325 | 0.1%
1618 | 246 | 0.1%
1617 | 223 | 0.1%
1518 | 368 | 0.1%
1517 | 207 | 0.1%

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
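
The definition above translates directly into a few lines of code. This is a minimal standard-library sketch, not the routine the report itself runs; the function name `pearson_r` is chosen here for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # The 1/n (or 1/(n-1)) factors cancel between numerator and denominator.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("r is undefined for a constant variable")
    return cov / (spread_x * spread_y)

print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))
```

Rescaling or shifting either input, for example replacing every x with 10 * x + 5, leaves the result unchanged, which is the invariance property described above.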

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
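
As a sketch of this recipe, the code below ranks both variables and then applies the Pearson formula to the ranks. Giving tied values the average of their ranks is an assumption made here (it is the common convention); the helper names are illustrative only.

```python
from math import sqrt

def average_ranks(values):
    """1-based ranks; tied values share the average of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Pearson correlation computed on the rank variables of xs and ys."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    spread = sqrt(sum((a - mean_x) ** 2 for a in rx) * sum((b - mean_y) ** 2 for b in ry))
    return cov / spread

# A cubic relationship is monotonic but not linear.
print(spearman_rho([1, 2, 3, 4, 5], [1, 8, 27, 64, 125]))
```

Because only the ordering matters, the cubic example gives ρ = 1 (up to floating-point rounding), whereas Pearson's r on the raw values would be below 1.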

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
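
The pair-counting description can be sketched as follows. Dividing by the total number of pairs, as stated above, gives the variant usually called tau-a. Statistical libraries often default to a tie-adjusted variant (tau-b), so values may differ when the data contain ties. Counting a pair tied in either variable as neither concordant nor discordant is an assumption of this sketch.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    concordant = discordant = total = 0
    for i, j in combinations(range(len(xs)), 2):
        total += 1
        direction = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if direction > 0:
            concordant += 1   # both variables move the same way
        elif direction < 0:
            discordant += 1   # the variables move in opposite ways
    return (concordant - discordant) / total

print(kendall_tau_a([1, 2, 3], [1, 3, 2]))
```

In the example, two of the three pairs are concordant and one is discordant, so τ = (2 - 1) / 3.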

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure that has been proposed by Bergsma in 2013 that can be found here.
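
For illustration, the bias-corrected estimator can be sketched from a contingency table of counts. The code below follows the commonly quoted form of Bergsma's correction and computes the chi-squared statistic without a continuity correction. Both choices are assumptions of this sketch, as is the requirement that every row and column total is positive.

```python
from math import sqrt

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    r, k = len(table), len(table[0])
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic (no continuity correction).
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))

print(cramers_v_corrected([[10, 0], [0, 10]]))  # perfect association
print(cramers_v_corrected([[5, 5], [5, 5]]))    # independence
```

The two toy tables are hypothetical and only mark the ends of the scale: a diagonal table yields 1 (up to rounding) and a uniform table yields 0.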

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion visually.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
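
One way to read "nullity correlation" is as the Pearson correlation between the present/missing indicators of two columns. The sketch below works under that assumption and uses hypothetical toy columns, with `None` marking a missing cell. It is not the code that produced the heatmap.

```python
from math import sqrt

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the present (1) / missing (0) indicators."""
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm = sqrt(sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b))
    if norm == 0:
        raise ValueError("undefined when a column is always present or always missing")
    return cov / norm

# Hypothetical toy columns, not taken from the dataset.
col_x = [None, None, 7, None, 15]
col_y = [None, None, 110, None, 180]   # missing in exactly the same rows as col_x
col_z = [1, 2, None, 4, None]          # present exactly where col_x is missing

print(nullity_correlation(col_x, col_y))
print(nullity_correlation(col_x, col_z))
```

Columns that are missing in exactly the same rows score +1, and columns where one is present only when the other is missing score -1. In this dataset, PayFukusyoUmaban4, PayFukusyoPay4 and PayFukusyoNinki4 report identical missing counts (337672), which is the kind of pattern such a heatmap makes visible.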

Sample

First rows

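(index) | PayFukusyoPay1 | PayFukusyoNinki1 | PayFukusyoUmaban2 | PayFukusyoPay2 | PayFukusyoNinki2 | PayFukusyoUmaban3 | PayFukusyoPay3 | PayFukusyoNinki3 | PayFukusyoUmaban4 | PayFukusyoPay4 | PayFukusyoNinki4 | PayFukusyoUmaban5 | PayFukusyoNinki5 | PayWakurenKumi1 | PayWakurenPay1 | PayWakurenNinki1 | PayWakurenKumi2 | PayWakurenPay2 | PayWakurenNinki2 | PayUmarenKumi1
0 | 270.00 | 5.00 | 15 | 160.00 | 1.00 | 5 | 190.00 | 2.00 | nan | nan | nan | nan | nan | 88 | 1,450.00 | 6.00 | nan | nan | nan | 1516
1 | 120.00 | 1.00 | 9 | 170.00 | 4.00 | 1 | 140.00 | 2.00 | nan | nan | nan | nan | nan | 56 | 690.00 | 2.00 | nan | nan | nan | 809
2 | 110.00 | 1.00 | 8 | 140.00 | 2.00 | 5 | 180.00 | 5.00 | nan | nan | nan | nan | nan | 48 | 310.00 | 1.00 | nan | nan | nan | 408
3 | 210.00 | 1.00 | 16 | 1,990.00 | 17.00 | 6 | 410.00 | 6.00 | nan | nan | nan | nan | nan | 78 | 970.00 | 2.00 | nan | nan | nan | 1516
4 | 270.00 | 5.00 | 15 | 160.00 | 1.00 | 5 | 190.00 | 2.00 | nan | nan | nan | nan | nan | 88 | 1,450.00 | 6.00 | nan | nan | nan | 1516
5 | 140.00 | 1.00 | 12 | 620.00 | 8.00 | 16 | 140.00 | 2.00 | nan | nan | nan | nan | nan | 56 | 1,350.00 | 6.00 | nan | nan | nan | 912
6 | 300.00 | 4.00 | 8 | 570.00 | 8.00 | 12 | 770.00 | 9.00 | nan | nan | nan | nan | nan | 15 | 2,430.00 | 11.00 | nan | nan | nan | 108
7 | 110.00 | 1.00 | 6 | 150.00 | 2.00 | 7 | 1,700.00 | 14.00 | nan | nan | nan | nan | nan | 38 | 320.00 | 1.00 | nan | nan | nan | 616
8 | 360.00 | 7.00 | 16 | 1,750.00 | 13.00 | 12 | 4,170.00 | 15.00 | nan | nan | nan | nan | nan | 78 | 5,940.00 | 17.00 | nan | nan | nan | 1416
9 | 110.00 | 1.00 | 8 | 210.00 | 3.00 | 11 | 240.00 | 4.00 | nan | nan | nan | nan | nan | 47 | 550.00 | 1.00 | nan | nan | nan | 814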
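
Rows 0 and 4 of this sample are identical, which illustrates the duplicate-row warning in the overview. The sketch below counts duplicates among the first five rows, assuming a row is a duplicate when it repeats an earlier row (the behaviour of pandas' `DataFrame.duplicated()` with its default `keep='first'`). Whether the report counts duplicates in exactly this way is an assumption. The eight columns that are `nan` in all of these rows are left out of the tuples.

```python
# First five sample rows, non-missing columns only, in table order:
# (PayFukusyoPay1, PayFukusyoNinki1, PayFukusyoUmaban2, PayFukusyoPay2,
#  PayFukusyoNinki2, PayFukusyoUmaban3, PayFukusyoPay3, PayFukusyoNinki3,
#  PayWakurenKumi1, PayWakurenPay1, PayWakurenNinki1, PayUmarenKumi1)
rows = [
    (270.0, 5.0, 15, 160.0, 1.0, 5, 190.0, 2.0, 88, 1450.0, 6.0, 1516),
    (120.0, 1.0, 9, 170.0, 4.0, 1, 140.0, 2.0, 56, 690.0, 2.0, 809),
    (110.0, 1.0, 8, 140.0, 2.0, 5, 180.0, 5.0, 48, 310.0, 1.0, 408),
    (210.0, 1.0, 16, 1990.0, 17.0, 6, 410.0, 6.0, 78, 970.0, 2.0, 1516),
    (270.0, 5.0, 15, 160.0, 1.0, 5, 190.0, 2.0, 88, 1450.0, 6.0, 1516),
]

seen = set()
n_duplicates = 0
for row in rows:
    if row in seen:
        n_duplicates += 1   # repeats a row that appeared earlier
    else:
        seen.add(row)

print(f"{n_duplicates} duplicate row(s) among the first {len(rows)}")
```

Applied to the full table, a count of this kind is what a figure such as 309474 duplicate rows (91.4%) would represent under the stated assumption.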

Last rows

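(index) | PayFukusyoPay1 | PayFukusyoNinki1 | PayFukusyoUmaban2 | PayFukusyoPay2 | PayFukusyoNinki2 | PayFukusyoUmaban3 | PayFukusyoPay3 | PayFukusyoNinki3 | PayFukusyoUmaban4 | PayFukusyoPay4 | PayFukusyoNinki4 | PayFukusyoUmaban5 | PayFukusyoNinki5 | PayWakurenKumi1 | PayWakurenPay1 | PayWakurenNinki1 | PayWakurenKumi2 | PayWakurenPay2 | PayWakurenNinki2 | PayUmarenKumi1
338582 | 110.00 | 1.00 | 5 | 120.00 | 2.00 | 8 | 350.00 | 5.00 | nan | nan | nan | nan | nan | 13 | 280.00 | 2.00 | nan | nan | nan | 205
338583 | 110.00 | 1.00 | 5 | 120.00 | 2.00 | 8 | 350.00 | 5.00 | nan | nan | nan | nan | nan | 13 | 280.00 | 2.00 | nan | nan | nan | 205
338584 | 340.00 | 4.00 | 13 | 340.00 | 5.00 | 1 | 180.00 | 2.00 | nan | nan | nan | nan | nan | 17 | 1,510.00 | 6.00 | nan | nan | nan | 213
338585 | 680.00 | 7.00 | 15 | 960.00 | 11.00 | 6 | 280.00 | 4.00 | nan | nan | nan | nan | nan | 47 | 1,350.00 | 5.00 | nan | nan | nan | 815
338586 | 180.00 | 3.00 | 2 | 160.00 | 1.00 | 12 | 250.00 | 5.00 | nan | nan | nan | nan | nan | 23 | 700.00 | 2.00 | nan | nan | nan | 204
338587 | 140.00 | 2.00 | 8 | 180.00 | 3.00 | 1 | 590.00 | 9.00 | nan | nan | nan | nan | nan | 78 | 490.00 | 3.00 | nan | nan | nan | 810
338588 | 180.00 | 3.00 | 6 | 190.00 | 4.00 | 2 | 150.00 | 1.00 | nan | nan | nan | nan | nan | 57 | 650.00 | 2.00 | nan | nan | nan | 610
338589 | 130.00 | 1.00 | 14 | 140.00 | 2.00 | 6 | 180.00 | 4.00 | nan | nan | nan | nan | nan | 47 | 530.00 | 1.00 | nan | nan | nan | 714
338590 | 220.00 | 2.00 | 9 | 540.00 | 7.00 | 4 | 910.00 | 11.00 | nan | nan | nan | nan | nan | 46 | 3,820.00 | 14.00 | nan | nan | nan | 509
338591 | 170.00 | 1.00 | 9 | 190.00 | 3.00 | 14 | 2,050.00 | 14.00 | nan | nan | nan | nan | nan | 55 | 1,190.00 | 4.00 | nan | nan | nan | 910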